[GGUF] feat: support loading diffusers format gguf checkpoints. #11684
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
pip install git+https://github.com/huggingface/diffusers.git@refs/pull/11684/head

from typing import List
import torch
import PIL.Image
from diffusers import AutoencoderKLWan, WanVACEPipeline, WanVACETransformer3DModel
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler
from diffusers.utils import export_to_video, load_image, load_video
from diffusers import GGUFQuantizationConfig
model_id = "a-r-r-o-w/Wan-VACE-1.3B-diffusers"
transformer_path = f"https://huggingface.co/newgenai79/Wan-VACE-1.3B-diffusers-gguf/blob/main/Wan-VACE-1.3B-diffusers-Q8_0.gguf"
transformer_gguf = WanVACETransformer3DModel.from_single_file(
    transformer_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config=model_id,
    subfolder="transformer",
)
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanVACEPipeline.from_pretrained(
    model_id,
    transformer=transformer_gguf,
    vae=vae,
    torch_dtype=torch.bfloat16,
)
flow_shift = 3.0 # 5.0 for 720P, 3.0 for 480P
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=flow_shift)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
prompt = "A sleek, humanoid robot stands in a vast warehouse filled with neatly stacked cardboard boxes on industrial shelves. The robot's metallic body gleams under the bright, even lighting, highlighting its futuristic design and intricate joints. A glowing blue light emanates from its chest, adding a touch of advanced technology. The background is dominated by rows of boxes, suggesting a highly organized storage system. The floor is lined with wooden pallets, enhancing the industrial setting. The camera remains static, capturing the robot's poised stance amidst the orderly environment, with a shallow depth of field that keeps the focus on the robot while subtly blurring the background for a cinematic effect."
negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards"
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=480,
    num_frames=81,
    num_inference_steps=30,
    guidance_scale=5.0,
    conditioning_scale=0.0,
    generator=torch.Generator().manual_seed(0),
).frames[0]
export_to_video(output, "output.mp4", fps=16)
Any suggestions on the above issue, please?
I am not sure this PR supports Wan yet.
It would be better to add a utility function:
def _should_convert_state_dict_to_diffusers(model_state_dict, checkpoint_state_dict):
    return not set(model_state_dict.keys()).issubset(set(checkpoint_state_dict.keys()))
If the condition passes, convert the checkpoint with this line:
if not set
@DN6 I think we discussed that in the conversion we will embed metadata to the GGUF file. This is now supported: ngxson/diffusion-to-gguf#3. Would you be able to make changes to this PR to see if that works?
What does this PR do?
Refer to ngxson/diffusion-to-gguf#1 to know how to obtain the checkpoint.
After the checkpoint is obtained, run the following code for inference:
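A minimal sketch of what such an inference script could look like; it is not the PR author's exact code. It assumes a Flux transformer, a local GGUF file path supplied by the caller, and `black-forest-labs/FLUX.1-dev` as the base repository for the config and the remaining pipeline components. All three are assumptions made for illustration, as is the function name.

```python
def load_flux_with_gguf_transformer(gguf_path, base_model_id="black-forest-labs/FLUX.1-dev"):
    """Build a Flux pipeline whose transformer comes from a diffusers-format GGUF file.

    `gguf_path` and `base_model_id` are assumptions for illustration only.
    """
    # Imported lazily so that defining this function does not require torch/diffusers.
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

    # Load the quantized transformer through the single-file entrypoint.
    transformer = FluxTransformer2DModel.from_single_file(
        gguf_path,
        quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
        config=base_model_id,
        subfolder="transformer",
        torch_dtype=torch.bfloat16,
    )

    # Reuse the other components (VAE, text encoders, scheduler) from the base repo.
    pipe = FluxPipeline.from_pretrained(
        base_model_id,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    pipe.enable_model_cpu_offload()
    return pipe


# Example usage (requires the model weights and a suitable GPU; path is hypothetical):
# pipe = load_flux_with_gguf_transformer("flux-diffusers-Q8_0.gguf")
# image = pipe("a photo of a cat", num_inference_steps=28, guidance_scale=3.5).images[0]
# image.save("output.png")
```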
Currently, the entrypoint for the diffusers formatted GGUF checkpoint is through from_single_file(). It remains to be seen if after https://github.com/ngxson/flux-to-gguf, we wanna support them through from_pretrained().
Sample diffusers-format GGUF file: https://huggingface.co/sayakpaul/flux-diffusers-gguf
@DN6 please feel free to make any changes or even change the direction of the PR as you see fit.